parametric learning
On Optimal Generalizability in Parametric Learning
We consider the parametric learning problem, where the objective of the learner is determined by a parametric loss function. Employing empirical risk minimization with possibly regularization, the inferred parameter vector will be biased toward the training samples. Such bias is measured by the cross validation procedure in practice where the data set is partitioned into a training set used for training and a validation set, which is not used in training and is left to measure the out-of-sample performance. A classical cross validation strategy is the leave-one-out cross validation (LOOCV) where one sample is left out for validation and training is done on the rest of the samples that are presented to the learner, and this process is repeated on all of the samples. LOOCV is rarely used in practice due to the high computational complexity. In this paper, we first develop a computationally efficient approximate LOOCV (ALOOCV) and provide theoretical guarantees for its performance. Then we use ALOOCV to provide an optimization algorithm for finding the regularizer in the empirical risk minimization framework. In our numerical experiments, we illustrate the accuracy and efficiency of ALOOCV as well as our proposed framework for the optimization of the regularizer.
Reviews: On Optimal Generalizability in Parametric Learning
This paper proposes an efficiently computable approximation of leave-one-out cross validation for parametric learning problems, as well as an algorithm for jointly learning the regularization parameters and model parameters. These techniques seem novel and widely applicable. The paper starts out clearly written, though maybe some less space could have been spent on laying the groundwork, leaving more room for the later sections where the notation is quite dense. Can you say anything about the comparsion between ALOOCV and LOOCV evaluated on only a subset of the data points (as you mention in l137-140), both in terms of computation cost and approximation accuracy? Other comments: l75: are you referring to PRESS?
Learning with Target Prior
In the conventional approaches for supervised parametric learning, relations between data and target variables are provided through training sets consisting of pairs of corresponded data and target variables. In this work, we describe a new learning scheme for parametric learning, in which the target variables \y can be modeled with a prior model p(\y) and the relations between data and target variables are estimated through p(\y) and a set of uncorresponded data \x in training. Specifically, LTP learning seeks parameter \t that maximizes the log likelihood of f_\t(\x) on a uncorresponded training set with regards to p(\y) . Compared to the conventional (semi)supervised learning approach, LTP can make efficient use of prior knowledge of the target variables in the form of probabilistic distributions, and thus removes/reduces the reliance on training data in learning. Compared to the Bayesian approach, the learned parametric regressor in LTP can be more efficiently implemented and deployed in tasks where running efficiency is critical, such as on-line BCI signal decoding.
Learning with Target Prior
Wang, Zuoguan, Lyu, Siwei, Schalk, Gerwin, Ji, Qiang
In the conventional approaches for supervised parametric learning, relations between data and target variables are provided through training sets consisting of pairs of corresponded data and target variables. In this work, we describe a new learning scheme for parametric learning, in which the target variables $\y$ can be modeled with a prior model $p(\y)$ and the relations between data and target variables are estimated through $p(\y)$ and a set of uncorresponded data $\x$ in training. Specifically, LTP learning seeks parameter $\t$ that maximizes the log likelihood of $f_\t(\x)$ on a uncorresponded training set with regards to $p(\y)$. Compared to the conventional (semi)supervised learning approach, LTP can make efficient use of prior knowledge of the target variables in the form of probabilistic distributions, and thus removes/reduces the reliance on training data in learning. Compared to the Bayesian approach, the learned parametric regressor in LTP can be more efficiently implemented and deployed in tasks where running efficiency is critical, such as on-line BCI signal decoding. We demonstrate the effectiveness of the proposed approach on parametric regression tasks for BCI signal decoding and pose estimation from video.
On Optimal Generalizability in Parametric Learning
Beirami, Ahmad, Razaviyayn, Meisam, Shahrampour, Shahin, Tarokh, Vahid
We consider the parametric learning problem, where the objective of the learner is determined by a parametric loss function. Employing empirical risk minimization with possibly regularization, the inferred parameter vector will be biased toward the training samples. Such bias is measured by the cross validation procedure in practice where the data set is partitioned into a training set used for training and a validation set, which is not used in training and is left to measure the out-of-sample performance. A classical cross validation strategy is the leave-one-out cross validation (LOOCV) where one sample is left out for validation and training is done on the rest of the samples that are presented to the learner, and this process is repeated on all of the samples. LOOCV is rarely used in practice due to the high computational complexity.
Learning with Target Prior
Wang, Zuoguan, Lyu, Siwei, Schalk, Gerwin, Ji, Qiang
In the conventional approaches for supervised parametric learning, relations between data and target variables are provided through training sets consisting of pairs of corresponded data and target variables. In this work, we describe a new learning scheme for parametric learning, in which the target variables $\y$ can be modeled with a prior model $p(\y)$ and the relations between data and target variables are estimated through $p(\y)$ and a set of uncorresponded data $\x$ in training. We term this method as learning with target priors (LTP). Specifically, LTP learning seeks parameter $\t$ that maximizes the log likelihood of $f_\t(\x)$ on a uncorresponded training set with regards to $p(\y)$. Compared to the conventional (semi)supervised learning approach, LTP can make efficient use of prior knowledge of the target variables in the form of probabilistic distributions, and thus removes/reduces the reliance on training data in learning. Compared to the Bayesian approach, the learned parametric regressor in LTP can be more efficiently implemented and deployed in tasks where running efficiency is critical, such as on-line BCI signal decoding. We demonstrate the effectiveness of the proposed approach on parametric regression tasks for BCI signal decoding and pose estimation from video.
- North America > United States > New York > Rensselaer County > Troy (0.04)
- North America > United States > New York > Albany County > Albany (0.04)
- South America > Argentina (0.04)
- (3 more...)